-
Notifications
You must be signed in to change notification settings - Fork 1.9k
feat(sources/s3): migrate to AWS SDK v2 #4069
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Closes #4054. Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for tackling this! It's been on our list for a bit. Unfortunately, this PR is causing some new integration test failures (in TestSource_Validate
). It looks like there may be some gaps in how it accommodates all the different combinations of buckets and roles that the source allows.
If you're curious, here are the new failures (generated after I modified the test code to actually print them):
=== RUN TestSource_Validate/roles_without_buckets,_all_can_access_at_least_one_account_bucket
2025-04-18T11:29:03-07:00 info-0 context Creating checkpointer {"timeout": 15}
s3_integration_test.go:246:
Error Trace: /Users/cody.rose/src/trufflehog/pkg/sources/s3/s3_integration_test.go:246
Error: Not equal:
expected: 0
actual : 1
Test: TestSource_Validate/roles_without_buckets,_all_can_access_at_least_one_account_bucket
s3_integration_test.go:248: role "arn:aws:iam::619888638459:role/s3-test-assume-role" could not list any s3 buckets for scanning: operation error S3: ListBuckets, get identity: get credentials: failed to refresh cached credentials, operation error STS: AssumeRole, https response error StatusCode: 403, RequestID: e554f8c8-a8bb-4aa0-beb5-f83f9b71ba18, api error InvalidClientTokenId: The security token included in the request is invalid.
--- FAIL: TestSource_Validate/roles_without_buckets,_all_can_access_at_least_one_account_bucket (0.62s)
=== RUN TestSource_Validate/role_and_buckets,_can_access_at_least_one_bucket
2025-04-18T11:29:04-07:00 info-0 context Creating checkpointer {"timeout": 15}
s3_integration_test.go:246:
Error Trace: /Users/cody.rose/src/trufflehog/pkg/sources/s3/s3_integration_test.go:246
Error: Not equal:
expected: 0
actual : 3
Test: TestSource_Validate/role_and_buckets,_can_access_at_least_one_bucket
s3_integration_test.go:248: could not get regional client for bucket "truffletestbucket-no-access": could not get s3 region for bucket: truffletestbucket-no-access: operation error S3: HeadBucket, get identity: get credentials: failed to refresh cached credentials, operation error STS: AssumeRole, https response error StatusCode: 403, RequestID: 4aff8536-ba55-4e52-b395-81a0df48042f, api error InvalidClientTokenId: The security token included in the request is invalid.
s3_integration_test.go:248: could not get regional client for bucket "truffletestbucket-s3-role-assumption": could not get s3 region for bucket: truffletestbucket-s3-role-assumption: operation error S3: HeadBucket, get identity: get credentials: failed to refresh cached credentials, operation error STS: AssumeRole, https response error StatusCode: 403, RequestID: 59dea661-dee0-4327-a3a2-a4e71f0010b2, api error InvalidClientTokenId: The security token included in the request is invalid.
s3_integration_test.go:248: role "arn:aws:iam::619888638459:role/s3-test-assume-role" could not list objects in any bucket
--- FAIL: TestSource_Validate/role_and_buckets,_can_access_at_least_one_bucket (0.47s)
2025-04-18T11:29:04-07:00 info-0 context Creating checkpointer {"timeout": 15}
s3_integration_test.go:246:
Error Trace: /Users/cody.rose/src/trufflehog/pkg/sources/s3/s3_integration_test.go:246
Error: Not equal:
expected: 1
actual : 6
Test: TestSource_Validate/roles_and_buckets,_one_error_per_role_that_cannot_access_at_least_one_bucket
s3_integration_test.go:248: could not get regional client for bucket "truffletestbucket-no-access": could not get s3 region for bucket: truffletestbucket-no-access: operation error S3: HeadBucket, get identity: get credentials: failed to refresh cached credentials, operation error STS: AssumeRole, https response error StatusCode: 403, RequestID: fc4aad7f-7ee5-481d-8e38-0e255888b915, api error InvalidClientTokenId: The security token included in the request is invalid.
s3_integration_test.go:248: could not get regional client for bucket "truffletestbucket-s3-role-assumption": could not get s3 region for bucket: truffletestbucket-s3-role-assumption: operation error S3: HeadBucket, get identity: get credentials: failed to refresh cached credentials, operation error STS: AssumeRole, https response error StatusCode: 403, RequestID: cbed8d71-4889-4fa5-b924-7776254c7c20, api error InvalidClientTokenId: The security token included in the request is invalid.
s3_integration_test.go:248: role "arn:aws:iam::619888638459:role/s3-test-assume-role" could not list objects in any bucket
s3_integration_test.go:248: could not get regional client for bucket "truffletestbucket-no-access": could not get s3 region for bucket: truffletestbucket-no-access: operation error S3: HeadBucket, get identity: get credentials: failed to refresh cached credentials, operation error STS: AssumeRole, https response error StatusCode: 403, RequestID: 41a29cae-3245-471d-8666-b17010cc5412, api error InvalidClientTokenId: The security token included in the request is invalid.
s3_integration_test.go:248: could not get regional client for bucket "truffletestbucket-s3-role-assumption": could not get s3 region for bucket: truffletestbucket-s3-role-assumption: operation error S3: HeadBucket, get identity: get credentials: failed to refresh cached credentials, operation error STS: AssumeRole, https response error StatusCode: 403, RequestID: 2deb96fb-a2ce-4878-9758-40e661c94080, api error InvalidClientTokenId: The security token included in the request is invalid.
s3_integration_test.go:248: role "arn:aws:iam::619888638459:role/test-no-access" could not list objects in any bucket
--- FAIL: TestSource_Validate/roles_and_buckets,_one_error_per_role_that_cannot_access_at_least_one_bucket (0.97s)
2025-04-18T11:29:05-07:00 info-0 context Creating checkpointer {"timeout": 15}
s3_integration_test.go:246:
Error Trace: /Users/cody.rose/src/trufflehog/pkg/sources/s3/s3_integration_test.go:246
Error: Not equal:
expected: 1
actual : 3
Test: TestSource_Validate/role_and_buckets,_a_bucket_doesn't_even_exist
s3_integration_test.go:248: could not get regional client for bucket "not-a-real-bucket-asljdhmglasjgvklhsdaljfh": could not get s3 region for bucket: not-a-real-bucket-asljdhmglasjgvklhsdaljfh: operation error S3: HeadBucket, get identity: get credentials: failed to refresh cached credentials, operation error STS: AssumeRole, https response error StatusCode: 403, RequestID: 4f7d2f8b-22a7-4f94-89dc-0d7a72f1301a, api error InvalidClientTokenId: The security token included in the request is invalid.
s3_integration_test.go:248: could not get regional client for bucket "truffletestbucket-s3-role-assumption": could not get s3 region for bucket: truffletestbucket-s3-role-assumption: operation error S3: HeadBucket, get identity: get credentials: failed to refresh cached credentials, operation error STS: AssumeRole, https response error StatusCode: 403, RequestID: 8dc601e7-e091-4bd8-b3f0-bdf0900266bd, api error InvalidClientTokenId: The security token included in the request is invalid.
s3_integration_test.go:248: role "arn:aws:iam::619888638459:role/s3-test-assume-role" could not list objects in any bucket
--- FAIL: TestSource_Validate/role_and_buckets,_a_bucket_doesn't_even_exist (0.48s)
Reference: #4069 (comment) Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
@rosecodym Thank you so much for reviewing this PR! I really appreciate it. Regarding the integration test failures, I believe they are caused by #4069 (comment), which I have addressed in commit a5411bf. When you have a chance, could you please re-run the integration tests to see if this resolves the issue? UPDATE: Another fix applied in commit 2191e18, see #4069 (comment). Note Just to clarify, the following code handles the scenario tested in if roleArn != "" {
// A valid credentials is required to assume IAM role. aws.AnonymousCredentials is not a valid credentials.
// If the value of credsProvider is aws.AnonymousCredentials{} from the above switch-case,
// we will need to set credsProvider to nil to use SDK's default credential chain.
_, isUnauthenticated := credsProvider.(aws.AnonymousCredentials)
if isUnauthenticated {
credsProvider = nil
} The trufflehog/pkg/sources/s3/s3_integration_test.go Lines 224 to 234 in fe041f5
P.S. You might have already noticed, but |
I performed a deeper debugging session and discovered an undocumented behavioral difference between V1’s Specifically, V1’s In the following code examples, I used a fake access key, fake secret key, and a fake role ARN.
|
Reference: #4069 (comment) Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
Thank you for the updates and detailed investigation! It looks like your change has fixed the issue with the validation integration test, but I'm still seeing some timeouts when running the tests on your branch. I'm investigating to see whether your PR actually changed anything or this is some ephemera related to how we're testing S3 (which, as you've discovered, isn't perfect). |
@rosecodym Would you mind sharing which tests are timing out and the corresponding error logs? |
Some of them were the tests in However, |
@rosecodym After another debugging session, I believe this issue in the I have submitted the fix in a separate PR: #4048. That one can be merged independently before this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your detailed investigation and work! I have one remaining outstanding question about the behavior before and after this change for a certain combination of options - please see my inline comment.
pkg/sources/s3/s3.go
Outdated
// A valid credentials is required to assume IAM role. aws.AnonymousCredentials is not a valid credentials. | ||
// If the value of credsProvider is aws.AnonymousCredentials{} from the above switch-case, | ||
// we will need to set credsProvider to nil to use SDK's default credential chain. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was the previous behavior if a user explicitly requested an unauthenticated connection and also attempted to assume a role? My reading of the comment preceding this one (in the default
switch case) is that such a combination of options should fail, because the default credentials play should only be used if the user fails to configure any authentication method at all. If I'm correct we should preserve that behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was the previous behavior if a user explicitly requested an unauthenticated connection and also attempted to assume a role
We will see an authentication error message with --log-level=5
. See the first screen recording role-without-auth-v1.mp4
below.
Note
From aws/aws-sdk-go-v2#2111 (comment):
Your application needs to get valid credentials first, and then call the STS client to assume the role programmatically, only then you can make API calls with the temporary creds that assumeRole returns.
-
AWS SDK V1
role-without-auth-v1.mp4
-
AWS SDK V2, before commit 8cce87c
role-without-auth-v2-continue.mp4
-
AWS SDK V2, after commit 8cce87c
role-without-auth-v2-break.mp4
Commit 8cce87c fixes an infinite loop issue in the s3.NewListObjectsV2Paginator
pagination. The HasMorePages
does not return false when an error occurs. We have to break
out of the loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the extra info! But my question was actually about the sourcespb.S3_Unauthenticated
case in the switch
statement above, which is not represented by those screen recordings. In retrospect, I can understand the confusion - the "unauthenticated" case is only available in TruffleHog Enterprise! However, we should still preserve its existing behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But my question was actually about the
sourcespb.S3_Unauthenticated
case in the switch statement above, which is not represented by those screen recordings.
Actually the 3 screen recordings above were indeed the case of sourcespb.S3_Unauthenticated
. According to ScanS3
function in pkg/engine/s3.go
:
Lines 19 to 45 in e42153d
connection := &sourcespb.S3{ | |
Credential: &sourcespb.S3_Unauthenticated{}, | |
} | |
if c.CloudCred { | |
if len(c.Key) > 0 || len(c.Secret) > 0 || len(c.SessionToken) > 0 { | |
return sources.JobProgressRef{}, fmt.Errorf("cannot use cloud environment and static credentials together") | |
} | |
connection.Credential = &sourcespb.S3_CloudEnvironment{} | |
} | |
if len(c.Key) > 0 && len(c.Secret) > 0 { | |
if len(c.SessionToken) > 0 { | |
connection.Credential = &sourcespb.S3_SessionToken{ | |
SessionToken: &credentialspb.AWSSessionTokenSecret{ | |
Key: c.Key, | |
Secret: c.Secret, | |
SessionToken: c.SessionToken, | |
}, | |
} | |
} else { | |
connection.Credential = &sourcespb.S3_AccessKey{ | |
AccessKey: &credentialspb.KeySecret{ | |
Key: c.Key, | |
Secret: c.Secret, | |
}, | |
} | |
} | |
} |
The Credential
is sourcespb.S3_Unauthenticated
when we do not pass any of the following flags to trufflehog s3
command:
--key
,--secret
--key
,--secret
,--session-token
--cloud-environment
Providing --role-arn
flag does not affect the Credential
field.
In retrospect, I can understand the confusion - the "unauthenticated" case is only available in TruffleHog Enterprise! However, we should still preserve its existing behavior.
Got it. I have reverted the change in commit 49a8209. The sourcespb.S3_Unauthenticated
case in V2 now matches existing V1 behavior.
unauthenticated-v1.mp4
unauthenticated-v2.mp4
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
Update on #4069 (comment). The AWS team has confirmed that this is a bug (aws/aws-sdk-go-v2#3077), and the fix will be included in the next SDK release. |
Fixes aws/aws-sdk-go-v2#3077. Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
pkg/sources/s3/s3.go
Outdated
// A valid credentials is required to assume IAM role. aws.AnonymousCredentials is not a valid credentials. | ||
// If the value of credsProvider is aws.AnonymousCredentials{} from the above switch-case, | ||
// we will need to set credsProvider to nil to use SDK's default credential chain. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the extra info! But my question was actually about the sourcespb.S3_Unauthenticated
case in the switch
statement above, which is not represented by those screen recordings. In retrospect, I can understand the confusion - the "unauthenticated" case is only available in TruffleHog Enterprise! However, we should still preserve its existing behavior.
// account, but the role probably doesn't have access to all of them). This makes it expected behavior | ||
// and therefore not an error. | ||
ctx.Logger().V(3).Info("could not list objects in bucket", "err", err) | ||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tell me if I understand this right: When using the V1 SDK, auth errors would occur when calling ListObjectsV2PagesWithContext
, meaning that there would be at most one (per bucket). But after this change, they occur with each individual call to paginator.NextPage
, meaning that we will get one per page, so we have to explicitly break out of the pagination loop. This means that if paginator.NextPage
returns some other error, we'll also break out of the loop, whereas before we'd just try the next page.
This seems like a functional change so I want to ensure I understand it. (It doesn't sound avoidable.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, your understanding is correct.
This seems like a functional change so I want to ensure I understand it. (It doesn't sound avoidable.)
Yes, it is unavoidable. In V1, pagination is handled using a callback passed to ListObjectsV2PagesWithContext
. If an error occurs, ListObjectsV2PagesWithContext
returns err
immediately.
trufflehog/pkg/sources/s3/s3.go
Lines 348 to 367 in e42153d
pageNumber := 1 | |
err = regionalClient.ListObjectsV2PagesWithContext( | |
ctx, | |
input, | |
func(page *s3.ListObjectsV2Output, _ bool) bool { | |
pageMetadata := pageMetadata{ | |
bucket: bucket, | |
pageNumber: pageNumber, | |
client: regionalClient, | |
page: page, | |
} | |
processingState := processingState{ | |
errorCount: &errorCount, | |
objectCount: &objectCount, | |
} | |
s.pageChunker(ctx, pageMetadata, processingState, chunksChan) | |
pageNumber++ | |
return true | |
}) |
In V2, pagination is handled using HasMorePages()
and NextPage()
. As mentioned in #4069 (comment), HasMorePages()
does not return false
when the first page returns an error. I use break
instead of return
here to allow scanning to continue with the next bucket, preserving the behavior from V1.
for bucketIdx := pos.index; bucketIdx < bucketsToScanCount; bucketIdx++ {
...
paginator := s3.NewListObjectsV2Paginator(regionalClient, input)
for paginator.HasMorePages() {
output, err := paginator.NextPage(ctx)
if err != nil {
break
}
...
}
}
Reference: #4069 (comment) Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
Does that mean that we need to wait for them to release the next version before merging this? |
@rosecodym The fix has already been released in |
oh whoops! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much for your detailed investigation and work and for accommodating all the back and forth.
i'm seeing a new integration test failure; i'll provide details once i have time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
false alarm! the test failure was some transient thing!
* main: [Feat] Added Mux API Analyzer (trufflesecurity#4128) fixed name of netlify analyzer in cli output (trufflesecurity#4140) fix(discordwebhook): Update Discord webhook detector to support 19-digit IDs (trufflesecurity#4133) [Feat] Added New AccuWeather Detector Version (trufflesecurity#4114) [Feat] Added Ngrok API Key Analyzer (trufflesecurity#4110) Improved JDBC Detector Regex (trufflesecurity#4109) [Feat] Detector implementation for Azure Configuration Connection String Key (trufflesecurity#3939) test(sources/s3): fix missing region error (trufflesecurity#4131) feat(sources/s3): migrate to AWS SDK v2 (trufflesecurity#4069) Update PreCommit.md (trufflesecurity#4112) Exclusion of FalsePositive GH's usernames in PrivateKeyDetector (trufflesecurity#4046) Monday App Analyzer (trufflesecurity#4120) [Feat] Detector implementation for Azure API Management Direct Management Key (trufflesecurity#3938) Fastly Analyzer (trufflesecurity#4082) Postman Code Uses Consistent Casing for Id Var Names (trufflesecurity#4124) Normalize UID to Uid in Postman Code (trufflesecurity#4125) postman_client.IDNameUUID becomes IdNameUid (trufflesecurity#4123) Fixed Kontent Detector (trufflesecurity#4122) # Conflicts: # pkg/analyzer/analyzers/analyzers.go # pkg/analyzer/cli.go
Description:
Closes #4054.
This PR migrates the S3 source to use AWS SDK v2 by following the migration guide https://docs.aws.amazon.com/sdk-for-go/v2/developer-guide/migrate-gosdk.html. This brings compatibility improvements and better long-term support from AWS.
Checklist:
make test-community
)?make lint
this requires golangci-lint)?To verify the changes manually, I:
trufflehog-aws-sdk-v2-local-test.mp4